Analyzing the Escherichia coli Gene Expression Data by a Multilayer Adjusted Tree Organizing Map

Authors

  • Ning Wei
  • Le Gruenwald
  • Tyrrell Conway
Abstract

With DNA microarray technology, biologists now have thousands of array data sets available. Discovering the functional relationships between genes and their involvement in biological processes depends on the ability to process and quantitatively analyze large amounts of array data efficiently. Clustering algorithms are among the popular tools that can help biologists achieve these goals. Although some existing research projects have applied clustering algorithms to biological data, none has examined Escherichia coli (E. coli) gene expression data. This paper proposes a clustering algorithm called the Multilayer Adjusted Tree Organizing Map (MATOM) to analyze E. coli gene expression data. In a semi-supervised manner, MATOM constructs a multilayer map and, at the same time, removes noisy data from the previously trained maps in order to improve the training process. The paper then presents the clustering results produced by MATOM and by other existing clustering algorithms on the E. coli gene expression data, together with a new evaluation method to assess them. The results show that MATOM performs best in terms of the percentage of genes that are clustered correctly.
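
The abstract does not say how "clustered correctly" is measured, and the paper's new evaluation method is not described on this page. As a rough illustration only, the sketch below uses a common stand-in, cluster purity: a gene counts as correctly clustered when its known functional group matches the majority group of its cluster. The function name `percent_correctly_clustered`, its inputs, and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def percent_correctly_clustered(cluster_ids, known_functions):
    """Purity-style score (an assumed stand-in, not the paper's metric).

    cluster_ids[i] is the cluster assigned to gene i, and
    known_functions[i] is that gene's known functional group.
    Returns the percentage of genes whose known group equals the
    majority group of the cluster they were placed in.
    """
    if len(cluster_ids) != len(known_functions):
        raise ValueError("inputs must have the same length")
    if not cluster_ids:
        return 0.0

    # Collect the known functional groups of the genes in each cluster.
    groups_by_cluster = defaultdict(list)
    for cluster, function in zip(cluster_ids, known_functions):
        groups_by_cluster[cluster].append(function)

    # Within each cluster, the genes carrying the majority group count as correct.
    correct = sum(
        Counter(groups).most_common(1)[0][1]
        for groups in groups_by_cluster.values()
    )
    return 100.0 * correct / len(cluster_ids)


# Toy data: six genes, three clusters, three hypothetical functional groups.
clusters = [0, 0, 0, 1, 1, 2]
functions = ["A", "A", "B", "B", "B", "C"]
score = percent_correctly_clustered(clusters, functions)
print(f"{score:.1f}% of genes clustered correctly")
```

A score like this needs genes with known functions, which is in line with the semi-supervised setting the abstract describes; it can also be inflated by producing many tiny clusters, so it is best compared across algorithms at a similar number of clusters.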

Similar resources

Cloning and sequencing of ompf Salmonella typhi Salmonella ompf gene in Escherichia coli Origami

Background and Aim: Salmonella Typhi belongs to the family Enterobacteriaceae, gram-negative bacilli and causes gastrointestinal diseases such as typhoid. This bacterium has a special structure and various genes, including the ompf gene (outer membrane protein). Recent studies have shown the possibility of using ompf in the development of a diagnostic tuberculosis vaccine. Therefore, the aim of...

Gene Expression Data Mining for Functional Genomics

Methods for supervised and unsupervised clustering and machine learning were studied in order to automatically model relationships between gene expression data and gene functions of the microorganism Escherichia coli. From a pre-selected subset of 265 genes (belonging to 3 functional groups) the function has been predicted with an accuracy higher than 50 % by various data mining methods describ...

Gene Expression Data Mining for Functional Genomics using Fuzzy Technology

Methods for supervised and unsupervised clustering and machine learning were studied in order to automatically model relationships between gene expression data and gene functions of the microorganism Escherichia coli. From a pre-selected subset of 265 genes (belonging to 3 functional groups) the function has been predicted with an accuracy of 63-71 % by various data mining methods described in ...

A COMPARATIVE STUDY BETWEEN EXPRESSION OF A SYNTHETIC GENE OF HUMAN BASIC FIBROBLAST GROWTH FACTOR (hbFGF) AND ITS RELATED cDNA IN ESCHERICHIA COLI

The gene encoding the human basic fibroblast growth factor (hbFGF) has been already chemically-synthesized and cloned in pET-3a expression vector (Pasteur Institute of Iran). In the present study, we compared the level of expression of this synthetic hbFGF and its related cDNA in Escherichia coli. The pBR322-cDNA of hbFGF supplied by Dr. Seno (from Molecular Biology Dept, Okaido prefectural uni...

Synthesis and Expression of Modified bFGF Gene in Escherichia coli Cells

A new strategy for construction of synthetic gene encoding human basic fibroblast growth factor comprising DNA annealing-ligation and augmentation by polymerase chain reaction was introduced. The sequence of the gene and corresponding amino acid chain were modified in order to increase stability of the protein. First, 300 bp and 160 bp fragments of the gene were assembled from 18 oligonucleotid...

Publication date: 2003